Search | Portal Regional da BVS (VHL Regional Portal)

1.

Information conveyed by voice quality.

Kreiman, Jody.

J Acoust Soc Am ; 155(2): 1264-1271, 2024 02 01.

Article in English | MEDLINE | ID: mdl-38345424

ABSTRACT

The problem of characterizing voice quality has long caused debate and frustration. The richness of the available descriptive vocabulary is overwhelming, but the density and complexity of the information voices convey lead some to conclude that language can never adequately specify what we hear. Others argue that terminology lacks an empirical basis, so that language-based scales are inadequate a priori. Efforts to provide meaningful instrumental characterizations have also had limited success. Such measures may capture sound patterns but cannot at present explain what characteristics, intentions, or identity listeners attribute to the speaker based on those patterns. However, some terms continually reappear across studies. These terms align with acoustic dimensions accounting for variance across speakers and languages and correlate with size and arousal across species. This suggests that labels for quality rest on a bedrock of biology: We have evolved to perceive voices in terms of size/arousal, and these factors structure both voice acoustics and descriptive language. Such linkages could help integrate studies of signals and their meaning, producing a truly interdisciplinary approach to the study of voice.

Subjects

Speech Perception; Speech Acoustics; Voice Quality; Sound; Hearing

2.

Acoustic voice variation in spontaneous speech.

Lee, Yoonjeong; Kreiman, Jody.

J Acoust Soc Am ; 151(5): 3462, 2022 05.

Article in English | MEDLINE | ID: mdl-35649890

ABSTRACT

This study replicates and extends the recent findings of Lee, Keating, and Kreiman [J. Acoust. Soc. Am. 146(3), 1568-1579 (2019)] on acoustic voice variation in read speech, which showed remarkably similar acoustic voice spaces for groups of female and male talkers and the individual talkers within these groups. Principal component analysis was applied to acoustic indices of voice quality measured from phone conversations for 99/100 of the same talkers studied previously. The acoustic voice spaces derived from spontaneous speech are highly similar to those based on read speech, except that unlike read speech, variability in fundamental frequency accounted for significant acoustic variability. Implications of these findings for prototype models of speaker recognition and discrimination are considered.

Subjects

Speech; Voice; Acoustics; Female; Humans; Male; Speech Acoustics; Voice Quality

3.

Speaker discrimination performance for "easy" versus "hard" voices in style-matched and -mismatched speech.

Afshan, Amber; Kreiman, Jody; Alwan, Abeer.

J Acoust Soc Am ; 151(2): 1393, 2022 02.

Article in English | MEDLINE | ID: mdl-35232083

ABSTRACT

This study compares human speaker discrimination performance for read speech versus casual conversations and explores differences between unfamiliar voices that are "easy" versus "hard" to "tell together" versus "tell apart." Thirty listeners were asked whether pairs of short style-matched or -mismatched, text-independent utterances represented the same or different speakers. Listeners performed better when stimuli were style-matched, particularly in read speech-read speech trials (equal error rate, EER, of 6.96% versus 15.12% in conversation-conversation trials). In contrast, the EER was 20.68% for the style-mismatched condition. When styles were matched, listeners' confidence was higher when speakers were the same versus different; however, style variation caused decreases in listeners' confidence for the "same speaker" trials, suggesting a higher dependency of this task on within-speaker variability. The speakers who were "easy" or "hard" to "tell together" were not the same as those who were "easy" or "hard" to "tell apart." Analysis of speaker acoustic spaces suggested that the difference observed in human approaches to "same speaker" and "different speaker" tasks depends primarily on listeners' different perceptual strategies when dealing with within- versus between-speaker acoustic variability.

Subjects

Speech Perception; Voice; Acoustics; Humans; Speech
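
The equal error rate (EER) reported in the abstract above is the operating point at which the rate of false "same speaker" decisions equals the rate of false "different speaker" decisions. The abstract does not say how the EER was derived from listener responses, so the following is only a minimal sketch under an assumption: each trial yields a numeric "same speaker" score (for example, a signed confidence rating), and a threshold is swept over the observed scores. The function name and the toy scores are hypothetical.

```python
def equal_error_rate(same_scores, diff_scores):
    """Approximate the EER by sweeping a decision threshold.

    same_scores: scores from same-speaker trials (higher = more "same").
    diff_scores: scores from different-speaker trials.
    A trial is judged "same" when its score is >= the threshold.
    Returns (eer, threshold) at the threshold where the false-rejection
    and false-acceptance rates are closest to each other.
    """
    best = None
    for t in sorted(set(same_scores) | set(diff_scores)):
        # Same-speaker trials wrongly judged "different".
        frr = sum(s < t for s in same_scores) / len(same_scores)
        # Different-speaker trials wrongly judged "same".
        far = sum(s >= t for s in diff_scores) / len(diff_scores)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


# Toy scores (hypothetical, not data from the study).
same = [0.9, 0.8, 0.7, 0.4]
diff = [0.1, 0.2, 0.3, 0.6]
eer, threshold = equal_error_rate(same, diff)
print(f"EER = {eer:.2%} at threshold {threshold}")
```

With real data the two error rates rarely cross exactly at an observed score, which is why the sketch reports their mean at the closest point rather than interpolating.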

4.

Effects of Laryngeal Vibratory Asymmetry and Neuromuscular Compensation on Voice Quality.

Pillutla, Pranati; Zhang, Zhaoyan; Kreiman, Jody; Wilhalme, Holly; Chhetri, Dinesh K.

Laryngoscope ; 132(1): 130-134, 2022 01.

Article in English | MEDLINE | ID: mdl-34216152

ABSTRACT

INTRODUCTION: Vibratory asymmetry and neuromuscular compensation are often seen in laryngeal neuromuscular pathology. However, the ramifications of these findings on voice quality are unclear. This study investigated the effects of varying levels of vibratory asymmetry and neuromuscular compensation on cepstral peak prominence (CPP), an analog of voice quality. STUDY DESIGN: In vivo canine phonation model. METHODS: Varying degrees of vocal fold vibratory asymmetry were achieved by stimulating one recurrent laryngeal nerve (RLN) over 11 levels from threshold to maximal muscle activation. For each of these levels, phonation was induced at systematically varied combinations of neuromuscular compensation: three levels each of contralateral RLN stimulation (80%, 90%, and 100% of maximal), superior laryngeal nerve (SLN) activation (0%, 50%, and 100% of maximal), and airflow levels (500, 700, and 900 mL/s). Vocal fold symmetry was determined by assessing the opening phase of the vibratory cycle in high-speed video recordings. Voice quality was estimated acoustically by calculating CPP for each voice sample. RESULTS: Eight hundred twenty-two phonatory conditions with varying degrees of vibratory asymmetry were evaluated. CPP was highest at vibratory symmetry. Increasing levels of asymmetry resulted in significant decreases in CPP. CPP increased significantly with increasing contralateral RLN activation. CPP was significantly higher at 50% SLN activation than 0% or 100% SLN activation. CONCLUSION: Voice quality, as approximated by CPP, is best at vibratory symmetry and deteriorates with increasing degrees of asymmetry. Voice quality may be improved with neuromuscular compensation by increased adduction of the contralateral vocal fold or increased vocal fold tension at mid-levels of SLN activation. LEVEL OF EVIDENCE: NA, Basic Science Laryngoscope, 132:130-134, 2022.

Subjects

Laryngeal Muscles/anatomy & histology; Laryngeal Nerves/anatomy & histology; Larynx/anatomy & histology; Voice Quality/physiology; Animals; Dogs; Laryngeal Muscles/physiology; Laryngeal Nerves/physiology; Larynx/physiology; Male; Vibration

5.

Perceptual Evaluation of Vocal Fold Vibratory Asymmetry.

Azar, Shaghauyegh S; Pillutla, Pranati; Evans, Lauran K; Zhang, Zhaoyan; Kreiman, Jody; Chhetri, Dinesh K.

Laryngoscope ; 131(12): 2740-2746, 2021 12.

Article in English | MEDLINE | ID: mdl-34106487

ABSTRACT

OBJECTIVES: Laryngeal vibratory asymmetry occurring with paresis may result in a perceptually normal or abnormal voice. The present study aims to determine the relationships between the degree of vibratory asymmetry, acoustic measures, and perception of sound stimuli. STUDY DESIGN: Animal Model of Voice Production, Perceptual Analysis of Voice. METHODS: In an in vivo canine model of phonation, symmetric and asymmetric laryngeal vibration were obtained via graded unilateral recurrent laryngeal nerve (RLN) stimulation simulating near paralysis to full activation. Phonation was performed at various contralateral RLN and bilateral superior laryngeal nerve stimulation levels. Naïve listeners rated the perceptual quality of 182 unique phonatory samples using a visual sort-and-rate task. Cepstral peak prominence (CPP) was calculated for each phonatory condition. The relationships among vibratory symmetry, CPP, and perceptual ratings were evaluated. RESULTS: A significant relationship emerged between RLN stimulation and perceptual rating, such that sound samples from low RLN levels were preferred to those from high RLN levels. When symmetric vibration was achieved at mid-RLN stimulation, listeners preferred samples from symmetric vibration over those from asymmetric vibration. However, when symmetry was achieved at high RLN levels, a strained voice quality resulted that listeners dispreferred over asymmetric conditions at lower RLN levels. CPP did not have a linear relationship with perceptual ratings. CONCLUSIONS: Laryngeal vibratory asymmetry produces variable perceptual differences in phonatory sound quality. Though CPP has been correlated with dysphonia in previous research, its complex relationship with quality limits its usefulness as a clinical marker of voice quality perception. LEVEL OF EVIDENCE: NA, basic science Laryngoscope, 131:2740-2746, 2021.

Subjects

Dysphonia/physiopathology; Laryngeal Nerves/physiopathology; Vocal Cord Paralysis/complications; Vocal Cords/physiopathology; Voice Quality/physiology; Acoustics; Animals; Disease Models, Animal; Dogs; Dysphonia/diagnosis; Electric Stimulation; Female; Humans; Male; Phonation/physiology; Vibration; Vocal Cord Paralysis/physiopathology; Vocal Cords/innervation

6.

Validating a psychoacoustic model of voice quality.

Kreiman, Jody; Lee, Yoonjeong; Garellek, Marc; Samlan, Robin; Gerratt, Bruce R.

J Acoust Soc Am ; 149(1): 457, 2021 01.

Article in English | MEDLINE | ID: mdl-33514179

ABSTRACT

No agreed-upon method currently exists for objective measurement of perceived voice quality. This paper describes validation of a psychoacoustic model designed to fill this gap. This model includes parameters to characterize the harmonic and inharmonic voice sources, vocal tract transfer function, fundamental frequency, and amplitude of the voice, which together serve to completely quantify the integral sound of a target voice sample. In experiment 1, 200 voices with and without diagnosed vocal pathology were fit with the model using analysis-by-synthesis. The resulting synthetic voice samples were not distinguishable from the original voice tokens, suggesting that the model has all the parameters it needs to fully quantify voice quality. In experiment 2 parameters that model the harmonic voice source were removed one by one, and the voice tokens were re-synthesized with the reduced model. In every case the lower-dimensional models provided worse perceptual matches to the quality of the natural tokens than did the original set, indicating that the psychoacoustic model cannot be reduced in dimensionality without loss of fit to the data. Results confirm that this model can be validly applied to quantify voice quality in clinical and research applications.

Subjects

Psychoacoustics; Voice Disorders; Voice; Female; Humans; Male; Speech; Speech Acoustics; Voice Quality

7.

Vocal Fundamental Frequency and Sound Pressure Level in Charismatic Speech: A Cross-Gender and -Language Study.

Signorello, Rosario; Demolin, Didier; Henrich Bernardoni, Nathalie; Gerratt, Bruce R; Zhang, Zhaoyan; Kreiman, Jody.

J Voice ; 34(5): 808.e1-808.e13, 2020 Sep.

Article in English | MEDLINE | ID: mdl-31196689

ABSTRACT

OBJECTIVES/HYPOTHESES: Charismatic leaders use vocal behavior to persuade their audience, achieve goals, arouse emotional states, and convey personality traits and leadership status. This study investigates voice fundamental frequency (f0) and sound pressure level (SPL) in female and male French, Italian, Brazilian, and American politicians to determine which acoustic parameters are related to cross-gender and cross-cultural common vocal abilities, and which derive from culture-, gender-, and language-specific vocal strategies used to adapt vocal behavior to listeners' culture-related expectations. STUDY DESIGN: Speech corpora were collected for two formal communicative contexts (leaders address followers or other leaders) and one informal communicative context (dyadic interaction), based on the persuasive goals inherent in each context and on the relative status of the listeners and speakers. Leaders' acoustic voice profiles were created to show differences in f0 and SPL manipulation with respect to speakers' gender and language in each communicative context. RESULTS: Cross-gender and cross-language similarities in manipulation of average f0 and in f0 and SPL ranges occurred in all communicative contexts. Patterns of f0 manipulation were shared across genders and cultures, suggesting this dimension might be biologically based and is exploited by leaders to convey dominance. Ranges for f0 and SPL seemed to be affected by the communicative context, being wider or narrower depending on the persuasive goal. Results also showed language- and speaker-specific differences in the acoustic manipulation of f0 and SPL over time. CONCLUSIONS: These findings are consistent with the idea that specific charismatic leaders' vocal behaviors depend on a fine combination of vocal abilities that are shared across cultures and genders, combined with culturally- and linguistically-filtered vocal strategies.

Subjects

Speech; Voice; Brazil; Female; Humans; Language; Male; Sound; Speech Acoustics

8.

Acoustic voice variation within and between speakers.

Lee, Yoonjeong; Keating, Patricia; Kreiman, Jody.

J Acoust Soc Am ; 146(3): 1568, 2019 09.

Article in English | MEDLINE | ID: mdl-31590565

ABSTRACT

Little is known about the nature or extent of everyday variability in voice quality. This paper describes a series of principal component analyses to explore within- and between-talker acoustic variation and the extent to which they conform to expectations derived from current models of voice perception. Based on studies of faces and cognitive models of speaker recognition, the authors hypothesized that a few measures would be important across speakers, but that much of within-speaker variability would be idiosyncratic. Analyses used multiple sentence productions from 50 female and 50 male speakers of English, recorded over three days. Twenty-six acoustic variables from a psychoacoustic model of voice quality were measured every 5 ms on vowels and approximants. Across speakers the balance between higher harmonic amplitudes and inharmonic energy in the voice accounted for the most variance (females = 20%, males = 22%). Formant frequencies and their variability accounted for an additional 12% of variance across speakers. Remaining variance appeared largely idiosyncratic, suggesting that the speaker-specific voice space is different for different people. Results further showed that voice spaces for individuals and for the population of talkers have very similar acoustic structures. Implications for prototype models of voice perception and recognition are discussed.

Subjects

Biological Variation, Individual; Biological Variation, Population; Speech Acoustics; Voice/physiology; Adult; Female; Humans; Male; Phonetics; Psychoacoustics
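
The abstract above reports how much variance the leading principal components account for (for example, 20% and 22% for the first component). As a rough illustration of what such a percentage means, the sketch below estimates the share of total variance captured by the first principal component of a small table of measurements. It is not the authors' procedure: it assumes the variables are standardized first (so the analysis runs on the correlation matrix), it uses power iteration in place of a full eigendecomposition, and the function names and toy data are hypothetical.

```python
import math
import random


def correlation_matrix(rows):
    """Correlation matrix of the columns of `rows` (one row per observation)."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    # Sample standard deviations; assumes no column is constant.
    sds = [math.sqrt(sum((x - m) ** 2 for x in c) / (n - 1))
           for c, m in zip(cols, means)]
    z = [[(x - m) / s for x, m, s in zip(r, means, sds)] for r in rows]
    d = len(cols)
    return [[sum(r[i] * r[j] for r in z) / (n - 1) for j in range(d)]
            for i in range(d)]


def pc1_variance_share(rows, iters=1000, seed=0):
    """Share of total variance explained by the first principal component."""
    corr = correlation_matrix(rows)
    d = len(corr)
    rng = random.Random(seed)
    # Random positive start vector for the power iteration.
    v = [rng.random() + 0.1 for _ in range(d)]
    for _ in range(iters):
        w = [sum(corr[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient approximates the largest eigenvalue.
    eigenvalue = sum(v[i] * sum(corr[i][j] * v[j] for j in range(d))
                     for i in range(d))
    # The trace of a correlation matrix equals the number of variables.
    return eigenvalue / d


# Toy table: three hypothetical acoustic measures for six tokens.
toy = [
    [1.0, 2.1, 0.3],
    [2.0, 3.9, 0.9],
    [3.0, 6.2, 0.1],
    [4.0, 8.1, 0.7],
    [5.0, 9.8, 0.4],
    [6.0, 12.2, 0.6],
]
print(f"PC1 explains {pc1_variance_share(toy):.1%} of the variance")
```

Power iteration only recovers the first component; a study with 26 variables would normally use a library routine that returns all components at once.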

9.

Towards understanding speaker discrimination abilities in humans and machines for text-independent short utterances of different speech styles.

Park, Soo Jin; Yeung, Gary; Vesselinova, Neda; Kreiman, Jody; Keating, Patricia A; Alwan, Abeer.

J Acoust Soc Am ; 144(1): 375, 2018 07.

Article in English | MEDLINE | ID: mdl-30075658

ABSTRACT

Little is known about human and machine speaker discrimination ability when utterances are very short and the speaking style is variable. This study compares text-independent speaker discrimination ability of humans and machines based on utterances shorter than 2 s in two different speaking styles (read sentences and speech directed towards pets, characterized by exaggerated prosody). Recordings of 50 female speakers drawn from the UCLA Speaker Variability Database were used as stimuli. Performance of 65 human listeners was compared to i-vector-based automatic speaker verification systems using mel-frequency cepstral coefficients, voice quality features, which were inspired by a psychoacoustic model of voice perception, or their combination by score-level fusion. Humans always outperformed machines, except in the case of style-mismatched pairs from perceptually-marked speakers. Speaker representations by humans and machines were compared using multi-dimensional scaling (MDS). Canonical correlation analysis showed a weak correlation between machine and human MDS spaces. Multiple regression showed that means of voice quality features could represent the most important human MDS dimension well, but not the dimensions from machines. These results suggest that speaker representations by humans and machines are different, and machine performance might be improved by better understanding how different acoustic features relate to perceived speaker identity.

Subjects

Speech Acoustics; Speech Perception/physiology; Speech/physiology; Voice/physiology; Adolescent; Adult; Comprehension/physiology; Female; Humans; Language; Male; Voice Quality; Young Adult

10.

Comparing Measures of Voice Quality From Sustained Phonation and Continuous Speech.

Gerratt, Bruce R; Kreiman, Jody; Garellek, Marc.

J Speech Lang Hear Res ; 59(5): 994-1001, 2016 10 01.

Article in English | MEDLINE | ID: mdl-27626612

ABSTRACT

Purpose: The question of what type of utterance (a sustained vowel or continuous speech) is best for voice quality analysis has been extensively studied but with equivocal results. This study examines whether previously reported differences derive from the articulatory and prosodic factors occurring in continuous speech versus sustained phonation. Method: Speakers with voice disorders sustained vowels and read sentences. Vowel samples were excerpted from the steadiest portion of each vowel in the sentences. In addition to sustained and excerpted vowels, a third set of stimuli was created by shortening sustained vowel productions to match the duration of vowels excerpted from continuous speech. Acoustic measures were made on the stimuli, and listeners judged the severity of vocal quality deviation. Results: Sustained vowels and those extracted from continuous speech contain essentially the same acoustic and perceptual information about vocal quality deviation. Conclusions: Perceived and/or measured differences between continuous speech and sustained vowels derive largely from voice source variability across segmental and prosodic contexts and not from variations in vocal fold vibration in the quasisteady portion of the vowels. Approaches to voice quality assessment by using continuous speech samples average across utterances and may not adequately quantify the variability they are intended to assess.

Subjects

Phonation; Speech; Voice Quality; Adult; Analysis of Variance; Female; Humans; Male; Sound Spectrography; Young Adult

11.

On Peer Review.

Kreiman, Jody.

J Speech Lang Hear Res ; 59(3): 480-3, 2016 06 01.

Article in English | MEDLINE | ID: mdl-27333021

ABSTRACT

PURPOSE: This letter briefly reviews ideas about the purpose and benefits of peer review and reaches some idealistic conclusions about the process. METHOD: The author uses both literature review and meditation born of long experience. RESULTS: From a cynical perspective, peer review constitutes an adversarial process featuring domination of the weak by the strong and exploitation of authors and reviewers by editors and publishers, resulting in suppression of new ideas, delayed publication of important research, and bad feelings ranging from confusion to fury. More optimistically, peer review can be viewed as a system in which reviewers and editors volunteer thousands of hours to work together with authors, to the end of furthering human knowledge. CONCLUSION: Editors and authors will encounter both peer-review cynics and idealists in their careers, but in the author's experience the second are far more prevalent. Reviewers and editors can help increase the positive benefits of peer review (and improve the culture of science) by viewing the system as one in which they work with authors on behalf of high-quality publications and better science. Authors can contribute by preparing papers carefully prior to submission and by interpreting reviewers' and editors' suggestions in this collegial spirit, however difficult this may be in some cases.

Subjects

Peer Review, Research; Cooperative Behavior; Humans

12.

Impact of Vocal Tract Resonance on the Perception of Voice Quality Changes Caused by Varying Vocal Fold Stiffness.

Signorello, Rosario; Zhang, Zhaoyan; Gerratt, Bruce; Kreiman, Jody.

Acta Acust United Acust ; 102(2): 209-213, 2016.

Article in English | MEDLINE | ID: mdl-27134616

ABSTRACT

Experiments using animal and human larynx models are often conducted without a vocal tract. While it is often assumed that the absence of a vocal tract has only small effects on vocal fold vibration, it is not actually known how sound production and quality are affected. In this study, the validity of using data obtained in the absence of a vocal tract for voice perception studies was investigated. Using a two-layer self-oscillating physical model, three series of voice stimuli were created: one produced with conditions of left-right symmetric vocal fold stiffness, and two with left-right asymmetries in vocal fold body stiffness. Each series included a set of stimuli created with a physical vocal tract, and a second set created without a physical vocal tract. Stimuli were re-synthesized to equalize the mean F0 for each series and normalized for amplitude. Listeners were asked to evaluate the three series in a sort-and-rate task. Multidimensional scaling analysis was applied to examine the perceptual interaction between the voice source and the vocal tract resonances. The results showed that the presence or absence of a vocal tract can significantly affect perception of voice quality changes due to parametric changes in vocal fold properties, except when the parametric changes in vocal fold properties produced an abrupt shift in vocal fold vibratory pattern resulting in a salient quality change.

13.

Modeling the voice source in terms of spectral slopes.

Garellek, Marc; Samlan, Robin; Gerratt, Bruce R; Kreiman, Jody.

J Acoust Soc Am ; 139(3): 1404-10, 2016 Mar.

Article in English | MEDLINE | ID: mdl-27036277

ABSTRACT

A psychoacoustic model of the voice source spectrum is proposed. The model is characterized by four spectral slope parameters: the difference in amplitude between the first two harmonics (H1-H2), the second and fourth harmonics (H2-H4), the fourth harmonic and the harmonic nearest 2 kHz in frequency (H4-2 kHz), and the harmonic nearest 2 kHz and that nearest 5 kHz (2 kHz-5 kHz). As a step toward model validation, experiments were conducted to establish the acoustic and perceptual independence of these parameters. In experiment 1, the model was fit to a large number of voice sources. Results showed that parameters are predictable from one another, but that these relationships are due to overall spectral roll-off. Two additional experiments addressed the perceptual independence of the source parameters. Listener sensitivity to H1-H2, H2-H4, and H4-2 kHz did not change as a function of the slope of an adjacent component, suggesting that sensitivity to these components is robust. Listener sensitivity to changes in spectral slope from 2 kHz to 5 kHz depended on complex interactions between spectral slope, spectral noise levels, and H4-2 kHz. It is concluded that the four parameters represent non-redundant acoustic and perceptual aspects of voice quality.

Subjects

Acoustics; Models, Theoretical; Speech Acoustics; Voice Quality; Adult; Female; Humans; Male; Psychoacoustics; Sound Spectrography; Speech Production Measurement; Young Adult
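
The four spectral slope parameters named in the abstract above (H1-H2, H2-H4, H4-2 kHz, and 2 kHz-5 kHz) are differences between harmonic amplitudes. A minimal sketch of that arithmetic follows, assuming the harmonic amplitudes of the voice source are already available in dB (estimating them from a recording is a separate problem that is not shown). The function name, the input layout, and the toy spectrum are hypothetical.

```python
def spectral_slopes(f0_hz, harmonic_db):
    """Compute the four slope parameters from harmonic amplitudes.

    f0_hz: fundamental frequency in Hz.
    harmonic_db: harmonic_db[k - 1] is the amplitude (dB) of harmonic k,
        so the list must reach at least the harmonic nearest 5 kHz.
    """
    def amp(k):
        return harmonic_db[k - 1]

    def nearest_harmonic(freq_hz):
        # Index of the harmonic whose frequency is closest to freq_hz.
        return max(1, round(freq_hz / f0_hz))

    k2 = nearest_harmonic(2000.0)
    k5 = nearest_harmonic(5000.0)
    if max(k5, 4) > len(harmonic_db):
        raise ValueError("harmonic_db does not reach the required harmonics")
    return {
        "H1-H2": amp(1) - amp(2),
        "H2-H4": amp(2) - amp(4),
        "H4-2kHz": amp(4) - amp(k2),
        "2kHz-5kHz": amp(k2) - amp(k5),
    }


# Toy source spectrum: f0 = 200 Hz, harmonic k has amplitude -k dB.
toy_spectrum = [-1.0 * k for k in range(1, 26)]
print(spectral_slopes(200.0, toy_spectrum))
```

Because each parameter spans a different frequency range, a spectrum with a uniform roll-off (as in the toy example) yields correlated values, which matches the abstract's remark that the parameters are predictable from one another through overall spectral roll-off.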

14.

Perceptual evaluation of voice source models.

Kreiman, Jody; Garellek, Marc; Chen, Gang; Alwan, Abeer; Gerratt, Bruce R.

J Acoust Soc Am ; 138(1): 1-10, 2015 Jul.

Article in English | MEDLINE | ID: mdl-26233000

ABSTRACT

Models of the voice source differ in their fits to natural voices, but it is unclear which differences in fit are perceptually salient. This study examined the relationship between the fit of five voice source models to 40 natural voices, and the degree of perceptual match among stimuli synthesized with each of the modeled sources. Listeners completed a visual sort-and-rate task to compare versions of each voice created with the different source models, and the results were analyzed using multidimensional scaling. Neither fits to pulse shapes nor fits to landmark points on the pulses predicted observed differences in quality. Further, the source models fit the opening phase of the glottal pulses better than they fit the closing phase, but at the same time similarity in quality was better predicted by the timing and amplitude of the negative peak of the flow derivative (part of the closing phase) than by the timing and/or amplitude of peak glottal opening. Results indicate that simply knowing how (or how well) a particular source model fits or does not fit a target source pulse in the time domain provides little insight into what aspects of the voice source are important to listeners.

Subjects

Auditory Perception/physiology; Voice Quality/physiology; Acoustic Stimulation; Adolescent; Adult; Glottis/physiology; Humans; Middle Aged; Models, Biological; Sound Localization/physiology; Sound Spectrography; Young Adult

15.

Toward a consensus on symbolic notation of harmonics, resonances, and formants in vocalization.

Titze, Ingo R; Baken, Ronald J; Bozeman, Kenneth W; Granqvist, Svante; Henrich, Nathalie; Herbst, Christian T; Howard, David M; Hunter, Eric J; Kaelin, Dean; Kent, Raymond D; Kreiman, Jody; Kob, Malte; Löfqvist, Anders; McCoy, Scott; Miller, Donald G; Noé, Hubert; Scherer, Ronald C; Smith, John R; Story, Brad H; Svec, Jan G; Ternström, Sten; Wolfe, Joe.

J Acoust Soc Am ; 137(5): 3005-7, 2015 May.

Article in English | MEDLINE | ID: mdl-25994732

Subjects

Acoustics; Linguistics/classification; Speech Acoustics; Speech-Language Pathology/classification; Terminology as Topic; Vocalization, Animal/classification; Voice Quality; Animals; Consensus; Humans; Linguistics/standards; Phonetics; Sound; Speech-Language Pathology/standards; Vibration

16.

Perceptual consequences of changes in epilaryngeal area and shape.

Samlan, Robin A; Kreiman, Jody.

J Acoust Soc Am ; 136(5): 2798-806, 2014 Nov.

Article in English | MEDLINE | ID: mdl-25373979

ABSTRACT

The influence of epilaryngeal area on glottal flow and the acoustic signal has been described [Titze, J. Acoust. Soc. Am. 123, 2733-2749 (2008)], but it is not known how (or whether) changes in epilaryngeal area influence perceived voice quality. This study examined these relationships in a kinematic vocal tract model. Epilaryngeal constrictions and expansions were simulated at the levels of the aryepiglottic folds and the ventricular folds in the context of four glottal configurations representing normal vibration to severe vocal fold paralysis, for the three corner vowels /a/, /i/, and /u/. Minimum and maximum glottal flow, maximum flow declination rate, spectral slope, cepstral peak prominence, and the harmonics-to-noise ratio were measured, and listeners completed a perceptual sort-and-rate task for all samples. Epilaryngeal constriction and expansion caused salient differences in voice quality. The location of constriction was also perceivable. Vowels simulated with aryepiglottic constriction demonstrated lower maximum airflow and less noise than the other epilaryngeal shapes, and listeners consistently perceived them as distinct from other stimuli. Acoustic differences decreased with increasing severity of simulated paralysis. Results of epilaryngeal constriction and expansion were similar for /a/ and /i/, and produced slightly different patterns for /u/.

Subjects

Larynx/anatomy & histology; Phonation/physiology; Speech Acoustics; Speech Perception; Adult; Anthropometry; Biomechanical Phenomena; Communication Aids for Disabled; Computer Simulation; Glottis/physiology; Glottis/ultrastructure; Humans; Larynx/pathology; Periodicity; Phonetics; Vibration; Vocal Cord Paralysis/physiopathology; Vocal Cord Paralysis/psychology; Vocal Cords; Voice Quality

17.

The glottaltopogram: a method of analyzing high-speed images of the vocal folds.

Chen, Gang; Kreiman, Jody; Alwan, Abeer.

Comput Speech Lang ; 28(5): 1156-1169, 2014 Sep 01.

Article in English | MEDLINE | ID: mdl-25170187

ABSTRACT

Laryngeal high-speed videoendoscopy is a state-of-the-art technique to examine physiological vibrational patterns of the vocal folds. With sampling rates of thousands of frames per second, high-speed videoendoscopy produces a large amount of data that is difficult to analyze subjectively. In order to visualize high-speed video in a straightforward and intuitive way, many methods have been proposed to condense the three-dimensional data into a few static images that preserve characteristics of the underlying vocal fold vibratory patterns. In this paper, we propose the "glottaltopogram," which is based on principal component analysis of changes over time in the brightness of each pixel in consecutive video images. This method reveals the overall synchronization of the vibrational patterns of the vocal folds over the entire laryngeal area. Experimental results showed that this method is effective in visualizing pathological and normal vocal fold vibratory patterns.

18.

Toward a unified theory of voice production and perception.

Kreiman, Jody; Gerratt, Bruce R; Garellek, Marc; Samlan, Robin; Zhang, Zhaoyan.

Loquens ; 1(1), 2014 Jan.

Article in English | MEDLINE | ID: mdl-27135054

ABSTRACT

At present, two important questions about voice remain unanswered: When voice quality changes, what physiological alteration caused this change, and if a change to the voice production system occurs, what change in perceived quality can be expected? We argue that these questions can only be answered by an integrated model of voice linking production and perception, and we describe steps towards the development of such a model. Preliminary evidence in support of this approach is also presented. We conclude that development of such a model should be a priority for scientists interested in voice, to explain what physical condition(s) might underlie a given voice quality, or what voice quality might result from a specific physical configuration.

19.

Development of a glottal area index that integrates glottal gap size and open quotient.

Chen, Gang; Kreiman, Jody; Gerratt, Bruce R; Neubauer, Juergen; Shue, Yen-Liang; Alwan, Abeer.

J Acoust Soc Am ; 133(3): 1656-66, 2013 Mar.

Article in English | MEDLINE | ID: mdl-23464035

ABSTRACT

Because voice signals result from vocal fold vibration, perceptually meaningful vibratory measures should quantify those aspects of vibration that correspond to differences in voice quality. In this study, glottal area waveforms were extracted from high-speed videoendoscopy of the vocal folds. Principal component analysis was applied to these waveforms to investigate the factors that vary with voice quality. Results showed that the first principal component derived from tokens without glottal gaps was significantly (p < 0.01) associated with the open quotient (OQ). The alternating-current (AC) measure had a significant effect (p < 0.01) on the first principal component among tokens exhibiting glottal gaps. A measure AC/OQ, defined as the ratio of AC to OQ, was proposed to combine both amplitude and temporal characteristics of the glottal area waveform for both complete and incomplete glottal closures. Analyses of "glide" phonations in which quality varied continuously from breathy to pressed showed that the AC/OQ measure was able to characterize the corresponding continuum of glottal area waveform variation, regardless of the presence or absence of glottal gaps.

Subjects

Glottis/anatomy & histology; Glottis/physiology; Phonation; Speech Acoustics; Voice Quality; Biomechanical Phenomena; Female; Humans; Laryngoscopy; Linear Models; Male; Principal Component Analysis; Time Factors; Vibration; Video Recording; Vocal Cords/anatomy & histology; Vocal Cords/physiology
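
The abstract above defines the AC/OQ measure only as the ratio of the alternating-current (AC) measure to the open quotient (OQ) of the glottal area waveform. The sketch below illustrates that ratio for one vibratory cycle under assumptions that are not taken from the paper: AC is treated as the peak-to-peak area, and OQ as the fraction of samples whose area exceeds the cycle minimum by more than a small fraction of AC, so that a constant glottal gap does not count as "open". The function name, the `open_fraction` parameter, and the toy waveforms are hypothetical.

```python
def ac_oq(area, open_fraction=0.05):
    """Return (ac, oq, ac / oq) for one cycle of a glottal area waveform.

    area: glottal area samples covering exactly one vibratory cycle.
    open_fraction: assumed threshold, as a fraction of the peak-to-peak
        area above the cycle minimum, beyond which a sample is "open".
    """
    lo, hi = min(area), max(area)
    ac = hi - lo  # assumed definition: peak-to-peak area
    if ac == 0:
        raise ValueError("waveform is constant; AC/OQ is undefined")
    threshold = lo + open_fraction * ac
    # Fraction of the cycle spent above the threshold.
    oq = sum(a > threshold for a in area) / len(area)
    return ac, oq, ac / oq


# Toy cycles (arbitrary area units): complete closure, then the same
# vibration riding on a constant glottal gap of 1 unit.
closed_cycle = [0, 0, 0, 0, 1, 2, 3, 2, 1, 0]
gap_cycle = [a + 1 for a in closed_cycle]
print(ac_oq(closed_cycle))
print(ac_oq(gap_cycle))
```

Under these assumptions the two toy cycles give the same value, which mirrors the abstract's point that a single measure can describe waveforms with and without glottal gaps. The units of AC, and any normalization applied in the study, are left open here.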

20.

Voice quality and tone identification in White Hmong.

Garellek, Marc; Keating, Patricia; Esposito, Christina M; Kreiman, Jody.

J Acoust Soc Am ; 133(2): 1078-89, 2013 Feb.

Article in English | MEDLINE | ID: mdl-23363123

ABSTRACT

This study investigates the importance of source spectrum slopes in the perception of phonation by White Hmong listeners. In White Hmong, nonmodal phonation (breathy or creaky voice) accompanies certain lexical tones, but its importance in tonal contrasts is unclear. In this study, native listeners participated in two perceptual tasks, in which they were asked to identify the word they heard. In the first task, participants heard natural stimuli with manipulated F0 and duration (phonation unchanged). Results indicate that phonation is important in identifying the breathy tone, but not the creaky tone. Thus, breathiness can be viewed as contrastive in White Hmong. Next, to understand which parts of the source spectrum listeners use to perceive contrastive breathy phonation, source spectrum slopes were manipulated in the second task to create stimuli ranging from modal to breathy sounding, with F0 held constant. Results indicate that changes in H1-H2 (difference in amplitude between the first and second harmonics) and H2-H4 (difference in amplitude between the second and fourth harmonics) are independently important for distinguishing breathy from modal phonation, consistent with the view that the percept of breathiness is influenced by a steep drop in harmonic energy in the lower frequencies.

Subjects

Phonation; Phonetics; Pitch Perception; Recognition, Psychology; Voice Quality; Acoustic Stimulation; Adult; Audiometry, Speech; Cues; Female; Humans; Logistic Models; Male; Middle Aged; Psychoacoustics; Sound Spectrography; Time Factors
